0. Introduction and Summary.

This paper presents some new mathematical and interpretive
material on concepts of statistical evidence. The material has
been given a self-contained elementary expository form (restricted
to the case of discrete probability distributions) suitable for
early inclusion in any mathematical statistics course, however
elementary. Part I (Sections 1-6) in particular is a brief
but rounded elementary account of "the concept of statistical
evidence and the anomalous problem of its interpretation."
Part II presents further axioms for evidence and derivations of
their interrelations in elementary form; its Sections 7-9 present
concepts which have figured significantly in the development of
statistical thinking, and are a basis for a reading of Part III.
The latter contains general discussion intended to give even serious
beginning students, among others, some suggestions for perspectives
on the broad field of statistical theories and the
historical pattern of their development.

The axioms of statistical evidence and some related concepts
discussed, with their mathematical interrelations, are summarized
schematically in Figure 1, p. 17a. The conceptual issues
and historical patterns discussed are indicated schematically in
Figure 2, p. 33a.


PART  I.  The  Concept  of  Statistical  Evidence  and  the  Anomalous 
Problem  of  its  Interpretation. 

1. Models of Statistical Evidence. Let E denote any specified
model of a (discrete) statistical experiment, in which the random
outcome X takes values in the sample space S containing points
x_1, x_2, ..., with (elementary) probability function

\[ f(x,\theta) = \mathrm{Prob}(X = x \mid \theta), \qquad x \in S, \]

where the parameter point θ lies in the parameter space Ω. In
case both S and Ω are finite, the model may be represented
conveniently by a stochastic matrix:

\[ E = (p_{ij}) = \begin{pmatrix} p_{11} & \cdots & p_{1J} \\ \vdots & & \vdots \\ p_{I1} & \cdots & p_{IJ} \end{pmatrix}, \]

where p_ij = Prob(X = j | θ = i), j = 1, ..., J, i = 1, ..., I. An
example is the experiment

\[ E_1 = \begin{pmatrix} .9 & .1 \\ .3 & .7 \end{pmatrix}. \]

Any instance of statistical evidence is represented by a
model of the form (E,x), where E is a specified experiment and
x is a specified observed outcome of E. An example is

\[ (E_1, 2) \quad \text{or} \quad \left( \begin{pmatrix} .9 & .1 \\ .3 & .7 \end{pmatrix},\ 2 \right), \]

which represents that X = 2 was observed as the outcome of the
experiment E_1.

The properties, concepts, interpretations, and uses appropriate
to statistical evidence in various cases have often been found
obscure and controversial. Various intuitive and mathematical
concepts and objective criteria have been proposed toward adequate
characterization and interpretation of statistical evidence. To
discuss these, it is sometimes convenient to write Ev(E,x) to
denote a concept or interpretation of the statistical evidence
(E,x) or the "evidential meaning" of (E,x). Among these concepts,
the simplest and intuitively most appealing seem to have a negative
character, referring to aspects of evidence describable as
"irrelevant," "uninformative," or "like 'noise'" (in the communication
theory sense).

2. A Concept of "Irrelevant Noise," (N).

Consider any discrete experiment E, with probability function
f(x,θ), x ∈ S, θ ∈ Ω. Let x' be a specified possible outcome of E.
Let Z be an auxiliary random variable (independent of X) taking
the values 1, 0, with known respective probabilities c, 1-c.
Let Y be the random variable defined by

\[ y = y(x,z) = \begin{cases} 1, & \text{if } x = x' \text{ and } z = 1, \\ 0, & \text{otherwise.} \end{cases} \]

Let E* be the experiment whose outcomes are just the observed
values y of the random variable Y. (E* may be described as a
"stochastically censored" version of E.) E* has p.d.f. f*(y,θ),
θ ∈ Ω, given by f*(1,θ) = c f(x',θ), f*(0,θ) = 1 - c f(x',θ);
thus E* is characterized fully by the probability c and the function
f(x',θ), θ ∈ Ω. Although an experiment of the form E* need not
be considered in relation to the experiment E, it is possible to
realize E* by use of E as indicated.

Let us consider Ev(E*,1), when E* is considered as having
been actually realized by use of E. Upon observing the outcome
Y = y(x,z) = 1 of E* one can immediately deduce, from the definition
of y(x,z), that outcome x' of E has occurred; and thus one
is in the same position as an experimenter who has carried out E
and observed x', except in the hypothetical respect that, if x'
had not occurred in E, one would in general have obtained only
incomplete information concerning the outcome x of E. Many
statisticians find it in accord with their concepts of evidence
to consider this hypothetical distinction irrelevant; and they
consequently consider Ev(E*,1) and Ev(E,x') as equivalent in
such cases.

The  concept  that  such  hypothetical  distinctions  are  irrelevant 
to  the  evidence  in  question  may  be  expressed  informally  as 
"irrelevance  of  stochastic  censoring  which  might  have,  but  in  fact 
did  not,  affect  an  observed  outcome."  This  concept  may  be 
formulated  as: 

(N): Axiom of irrelevant noise. Let E be any (discrete) experiment,
with probability function f(x,θ), θ ∈ Ω. Let Z be an
auxiliary random variable independent of X, taking values
1, 0, with respective known probabilities c, 1-c (independent
of θ). Let x' be any specified possible outcome of E.
Let Y be the random variable defined by y = y(x,z) = 1
if x = x' and z = 1, and y = 0 otherwise. Let E* be the
experiment in which Y is observed (but not X nor Z).
Then Ev(E*,1) = Ev(E,x').

An example of the simplest sort is:

\[ E_1 = \begin{pmatrix} .9 & .1 \\ .3 & .7 \end{pmatrix}, \qquad c = 2/3, \qquad x' = 1, \]

from which we determine

\[ E^* = \begin{pmatrix} .6 & .4 \\ .2 & .8 \end{pmatrix}. \]

Here (N) implies Ev(E_1,1) = Ev(E*,1).
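
The censoring construction can be sketched numerically. The following is a minimal illustration, not part of the original development: the function name `censor`, the 1-based outcome label, and the column order (Y = 1 first, Y = 0 second) are assumptions made here, and the value c = 2/3 is assumed because it reproduces the second row (.2, .8) of the censored experiment in the example with E_1 = (.9 .1; .3 .7) and x' = 1.

```python
from fractions import Fraction

def censor(E, x_prime, c):
    """Return the two-column matrix of the censored experiment E*.

    E       : list of rows; E[i][j] = Prob(X = j+1 | theta = i+1)
    x_prime : 1-based label of the specified outcome x' (hypothetical convention)
    c       : known probability that the auxiliary variable Z equals 1
    Column 0 of the result is Prob(Y = 1 | theta) = c * f(x', theta);
    column 1 is Prob(Y = 0 | theta) = 1 - c * f(x', theta).
    """
    j = x_prime - 1
    return [[c * row[j], 1 - c * row[j]] for row in E]

# Assumed reading of the example: E_1 with rows indexed by the parameter.
E1 = [[Fraction(9, 10), Fraction(1, 10)],
      [Fraction(3, 10), Fraction(7, 10)]]

E_star = censor(E1, x_prime=1, c=Fraction(2, 3))
for row in E_star:
    print([float(p) for p in row])
```

Exact fractions are used so that the rows of E* can be compared with the printed matrix without rounding error.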

3. The Likelihood Axiom, (L).

Another concept of statistical evidence, whose intuitive
and historical background is discussed in later sections, is
expressed in:

(L): The likelihood axiom: Let (E,x'), (E*,y') be any
instances of statistical evidence with common parameter
space Ω, and with probability functions such that for
some positive c, f(x',θ) = c f*(y',θ), θ ∈ Ω. Then
Ev(E,x') = Ev(E*,y').

Lemma. (N) is equivalent to (L).

Proof. Under the conditions in the statement of (N), the probability
function of outcome Y = 1 of E* is given by f*(1,θ) = c f(x',θ),
θ ∈ Ω. But the last relation is the condition in (L). Hence
when (L) is assumed we have Ev(E,x') = Ev(E*,1), which is the
conclusion of (N).

To prove the converse, as in the condition of (L) let
(E_1,x_1), (E_2,x_2) be any two instances of statistical evidence,
with common parameter space Ω and with probability functions
f_1(x_1,θ), f_2(x_2,θ), such that for some positive c, f_1(x_1,θ) =
c f_2(x_2,θ), θ ∈ Ω. Without loss of generality we may assume that
c ≤ 1 (since otherwise we could write f_2(x_2,θ) = (1/c) f_1(x_1,θ),
where (1/c) < 1). Let E_1* be the "stochastically censored"
version of E_1, determined by taking the specified outcome x_1 and
the specified probability c_1 = 1; thus E_1* is characterized by
f_1*(1,θ) = f_1(x_1,θ), θ ∈ Ω. Similarly let E_2* be determined from
E_2, x_2, and c_2 = c; E_2* is characterized by f_2*(1,θ) = c f_2(x_2,θ),
θ ∈ Ω. We observe that f_1*(1,θ) = f_2*(1,θ), since by assumption
f_1(x_1,θ) = c f_2(x_2,θ), θ ∈ Ω. Thus (E_1*,1) and (E_2*,1) are
mathematically identical, whence Ev(E_1*,1) = Ev(E_2*,1). Further,
by (N) we have Ev(E_1*,1) = Ev(E_1,x_1) and Ev(E_2*,1) = Ev(E_2,x_2).
Thus Ev(E_1,x_1) = Ev(E_2,x_2), which is the conclusion of (L),
completing the proof.
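
The key step of the converse, that the two censored experiments coincide, can be checked on a small case. The sketch below is illustrative only: the three-point parameter space, the particular probabilities, and the helper name `censored_column` are hypothetical choices, not taken from the text.

```python
from fractions import Fraction

def censored_column(f_at_outcome, c):
    """Prob(Y = 1 | theta) for each theta, i.e. c * f(x', theta)."""
    return [c * p for p in f_at_outcome]

# Hypothetical values of f_2(x_2, theta) over a three-point parameter space.
f2_at_x2 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 10)]

# Hypothetical constant c <= 1, and f_1(x_1, theta) = c * f_2(x_2, theta).
c = Fraction(2, 5)
f1_at_x1 = [c * p for p in f2_at_x2]

# E_1 censored with c_1 = 1, and E_2 censored with c_2 = c.
col1 = censored_column(f1_at_x1, Fraction(1))
col2 = censored_column(f2_at_x2, c)

# Since Prob(Y = 0 | theta) = 1 - Prob(Y = 1 | theta), equal columns
# mean the two censored experiments are the same stochastic matrix.
identical = (col1 == col2)
print(identical)
```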

This concept of "irrelevant noise" and its implications were
introduced by the present writer (1961, pp. 418-9, 4;0), with
examples in terms of communication channels.

Axiom (L) may be conveniently stated in terms of the
likelihood ratio function λ(θ_1,θ_2;x) = f(x,θ_2)/f(x,θ_1),
θ_1, θ_2 ∈ Ω, determined by any instance (E,x) of statistical
evidence: (L) is the assertion that if (E,x') and (E*,y')
determine the same likelihood ratio function, then
Ev(E,x') = Ev(E*,y'). The likelihood ratio function is somewhat
redundant and is represented more conveniently for most theoretical
and practical purposes by fixing θ_1 at any value for which f(x,θ_1)
is positive and finite, or more generally by replacing in λ the
factor 1/f(x,θ_1) by an arbitrary positive constant c; the resulting
function c f(x,θ), θ ∈ Ω, is called (a representation of)
the likelihood function.

4. Interpretations of Likelihood Functions in the Binary Case.

In the case of binary experiments, that is, experiments with
a parameter space of just two points, θ_i or i = 1, 2, the likelihood
function is conveniently represented by the likelihood ratio
statistic λ = λ(x) = f(x,2)/f(x,1), x ∈ S. For example in the
experiment

\[ E_1 = \begin{pmatrix} .9 & .1 \\ .3 & .7 \end{pmatrix}, \]

the possible outcomes j = 1, 2, determine respectively the
likelihood ratios λ(1) = .3/.9 = 1/3 and λ(2) = .7/.1 = 7.
If the likelihood axiom (L) is accepted
as a characterization of the evidential meaning Ev(E,x) in any
binary case, then Ev(E,x) is characterized fully by the number
λ(x), without other reference to the form of (E,x). The problem
of interpretation of such numbers as representations of instances
of statistical evidence has been discussed in detail by the present
writer (1961, 1962) and others cited there.

All considerations seem to support not only the high
plausibility but the clear appropriateness of a single mode of
interpretation of likelihood functions in the binary case, namely,
interpretations in which numerical likelihood ratios as such are
taken as indices of evidential meaning; and in which strength of
evidential support for i = 2 as against i = 1 increases as λ takes
higher values on the continuous scale from 0 to ∞; with λ = 1
representing neutral evidence equivalent to no evidence; and
λ = ∞ (or 0) representing evidence of the greatest possible
strength (virtually comparable to the strength of deductive
logical evidence) for i = 2 as against i = 1 (or vice versa).

4.1. Error-Probabilities and the Concept of Unbiased Evidential
Interpretations, (U).

One of these considerations (supporting or confirming the
adequacy of this mode of evidential interpretations) is based
upon the concept of error-probabilities. It seems to many,
including the present writer, that one very appropriate minimum
requirement for the adequacy of any mode of characterizing and
interpreting statistical evidence is a requirement in terms of
error-probabilities, which may be expressed as:

(U): Unbiasedness criterion for a mode of evidential interpretations:
Systematically misleading or inappropriate interpretations
shall be impossible; that is, under no θ_i shall there be
high probability of outcomes interpreted as "strong evidence
against θ_i."

(U) differs in several respects from the axioms for statistical
evidence formulated above and below. Each axiom specifies the
equivalence of certain instances of evidential meanings and
interpretations, without referring to the form or nature of the latter
concept. (U) does refer to these, with a measure of vagueness
which is necessary at least at this stage of the present discussion,
but with sufficient precision to allow clear demonstration that the
criterion (U) is met very well by (L) and the mode of interpreting
λ values indicated above.

The essential reason for this adequacy is seen clearly in
the very definition of λ(x) = f(x,2)/f(x,1): For λ values which
are much larger than unity (e.g. λ ≥ 100), which would be
interpreted as relatively strong evidence for i = 2 as against i = 1,
the probability in any experiment whatever is at least 100 times
as large under i = 2 (the case in which evidence thus interpreted
is highly appropriate and desirable) as under i = 1 (the case in
which evidence thus interpreted is highly inappropriate and
misleading). A parallel comment applies to λ values much smaller
than unity (e.g. λ ≤ .01). And λ values not far from unity, which
would be interpreted as weak evidence as between i = 1 and i = 2,
have, in any experiment whatever, probabilities which are not
far from unity in ratio; for example, the probability of λ = 1,
which would be interpreted as strictly neutral or uninformative
evidence, is the same under i = 1 and i = 2 in every experiment.
(U) is satisfied since the definition of λ(x) leads directly to
bounds such as Pr(λ ≤ .01 | i=2) ≤ (.01) Pr(λ ≤ .01 | i=1) ≤ .01.
Of course experiments in which weak evidence has small probabilities
are preferable. Further detailed examples and interpretations
were given by the writer (1961, 1962).
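
The bound quoted above can be checked numerically on arbitrary binary experiments. The sketch below is only an illustration under stated assumptions: the experiments are randomly generated rather than taken from the text, the helper names `check_bound` and `random_distribution` are hypothetical, and the region {λ ≤ k} is written as f(x,2) ≤ k f(x,1) so that outcomes with f(x,1) = 0 need no special handling.

```python
import random

def check_bound(f1, f2, k):
    """For outcome probabilities f1 (under i = 1) and f2 (under i = 2),
    return (Pr(lam <= k | i = 2), k * Pr(lam <= k | i = 1)),
    where lam(x) = f2[x] / f1[x]."""
    region = [x for x in range(len(f1)) if f2[x] <= k * f1[x]]
    p2 = sum(f2[x] for x in region)
    p1 = sum(f1[x] for x in region)
    return p2, k * p1

def random_distribution(n, rng):
    """A random probability vector of length n (illustrative only)."""
    w = [rng.random() for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]

rng = random.Random(0)
violations = 0
for _ in range(1000):
    n = rng.randint(2, 20)
    f1 = random_distribution(n, rng)
    f2 = random_distribution(n, rng)
    for k in (0.01, 0.1, 0.5):
        p2, bound = check_bound(f1, f2, k)
        # Small tolerance for floating-point summation error.
        if p2 > bound + 1e-12 or bound > k + 1e-12:
            violations += 1

print(violations)
```

The check mirrors the argument in the text: on the region where f(x,2) ≤ k f(x,1), summing over outcomes gives the first inequality, and a probability never exceeds 1, which gives the second.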

Not only does (L), together with such evidential interpretations
of λ values, meet the condition of adequacy suggested by the
general error-probability concept, but it does so in a manner
which is free of the obscure, awkward, or implausible features
found in alternative, more standard modes of evidential
interpretations:

In the Neyman-Pearson approach, which takes error-probabilities
as its single basic concept, the requirement to fix a Type I
error-probability in order to determine a Type II error-probability
forces upon evidence the unnatural dichotomy into "rejection" and
other. In fact common practice modifies or ignores this
dichotomization in favor of the more plausible reporting of
P-levels lying in the range of possible Type I error-probabilities,
typically without reference to associated Type II error-probabilities.
In neither case does the standard testing approach include
any definite concepts of evidential interpretation associated with
error-probabilities. Neyman himself has been foremost in insisting
that standard statistical methods such as estimation and testing
cannot appropriately be interpreted as methods of inference in the
sense of evidential interpretations (e.g. 1957, 1962). Nevertheless
the latter interpretation is taken as basic, implicitly if
not explicitly, in a major part of exposition and application of
the Neyman-Pearson theory. Such typical interpretations are made
clearly explicit in, for example, the modern textbook by Walker
and Lev (1953, p. 5^) where an interpreted term distinct from
probability, called "confidence," is introduced to represent the
evidential interpretation of a typical estimate given by the
Neyman-Pearson theory: "To distinguish a confidence statement
from a probability statement, this text will use the notation
Conf(.05 < P < .65) = .95. ... This may be expressed in words
thus: 'We have confidence .95 that the unknown proportion lies
between .05 and .65' or 'We assert that P is not smaller than .05
and not larger than .65 and we have arrived at these numbers by
a procedure which if applied repeatedly would yield interval
estimates of which 5% would not contain the true value...'"
The presentation as alternatives of these two interpretations of
the new term also makes explicit the widespread view that the
basic error-probability property stated last is tantamount to or
at least warrants an evidential interpretation involving concepts
other than error-probabilities.

Finally, we note that the approach based exclusively upon
error-probabilities lacks even a basis for confronting such
plausible concepts of evidence as that of "irrelevant noise,"
(N), introduced above (or the weaker "sufficiency" concept (S)
discussed below).

5. Interpretations of Likelihood Functions in the General Case.

The only mode which has been suggested for evidential
interpretation of likelihood functions in general is due to Fisher (e.g. 1956)
and Barnard (e.g. 1962). This includes the interpretations described
above for the binary case, where it appeared eminently satisfactory.
But in the general case it appears crucially incomplete and
inadequate in several respects.

Consider the example of a single observation on X, assumed
normally distributed with unknown mean μ and standard deviation
σ, -∞ < μ < ∞, σ > 0. Any observed value x determines a
likelihood function (represented, up to an arbitrary positive constant
factor, by)

\[ L(x,\mu,\sigma) = \frac{1}{\sigma} \exp\left[ -\frac{1}{2\sigma^2} (\mu - x)^2 \right], \qquad -\infty < \mu < \infty, \quad \sigma > 0. \]

For each arbitrarily large finite number M, and each arbitrarily
small positive m < M, L(x,μ,σ) ≥ M occurs only on (a subset of
the) parameter points for which

\[ \sigma \le k_M \quad \text{and} \quad x - K_M \le \mu \le x + K_M, \]

where k_M, K_M are numbers depending just on M; and L(x,μ,σ) ≤ m
occurs on every parameter point for which σ ≥ K'_m, where K'_m
depends just on m. The ratio K'_m/k_M can be made arbitrarily large
by suitable choice of M and m. All considerations toward a
plausible mode of evidential interpretation of likelihood functions
in general seem to suggest that in such a case small values σ ≤ k_M
are to be considered as supported by the observed statistical
evidence, as against large values σ ≥ K'_m (possibly with addition
of considerations linking certain μ values to certain σ values),
with a strength which is very high with suitable values of M and m.
However consider any fixed M, m, μ' and σ' ≥ K'_m; for this parameter
point the probability is unity of an observation x and a
likelihood function L of a form leading to the evidential interpretations
just described, which must evidently be considered strongly
misleading and inappropriate when this parameter point is true.

(All  interpreted  features  of  this  example  could  be  duplicated 
in  a  discrete  example,  e.g.  a  family  of  discrete  distributions  of 
X  closely  approximating  the  respective  normal  distributions  of 
the  example.) 
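
The behavior of this likelihood function can be illustrated numerically. The sketch below rests on an observation that is not stated in the text: since the exponential factor is at most 1, L(x,μ,σ) ≤ 1/σ, so L ≥ M forces σ ≤ 1/M, and σ ≥ 1/m forces L ≤ m. The explicit thresholds 1/M and 1/m, the function name `likelihood`, and the sample observations are therefore assumptions made for illustration.

```python
import math

def likelihood(x, mu, sigma):
    """L(x, mu, sigma) = (1/sigma) * exp(-(mu - x)^2 / (2 sigma^2))."""
    return math.exp(-((mu - x) ** 2) / (2.0 * sigma ** 2)) / sigma

M, m = 1000.0, 0.001

# Assumed thresholds derived from L <= 1/sigma (not given in the text).
small_sigma = 0.5 / M   # a value inside the region sigma <= 1/M
large_sigma = 2.0 / m   # a value inside the region sigma >= 1/m

# Whatever x is observed, the likelihood at (mu = x, small sigma) is at least M ...
observations = [-50.0, 0.0, 3.7, 1.0e4]
peaks = [likelihood(x, x, small_sigma) for x in observations]

# ... while every parameter point with sigma >= 1/m has likelihood at most m.
tails = [likelihood(x, mu, s)
         for x in observations
         for mu in (x - 10.0, x, x + 10.0)
         for s in (large_sigma, 10.0 * large_sigma)]

print(min(peaks) >= M, max(tails) <= m)
```

Because the shape of L does not depend on which x is observed, the same pattern appears for every observation, including those drawn when the true standard deviation is large; this is the feature the example turns on.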

Thus in the general case, unlike the binary case, the sole
suggested (Fisher-Barnard) mode of evidential interpretations
compatible with (L) fails grossly to satisfy the adequacy
condition (U) suggested by the Neyman-Pearson concept of
error-probabilities. Analogous conclusions were supported by
Neyman (19^2; 2nd ed., 1952), Armitage (in Smith (1961), pp. 32-Jih),
and Stein (1962) by consideration of experiments of more
complicated forms (asymptotic or sequential).

(The concepts of intrinsic confidence and significance levels
discussed by the present writer (1962, Part II) seem to have at
most heuristic, but not substantial, value for evidential
interpretations. We exclude from the present discussion the Bayesian
approach, in which a characterization of evidence formally like
(L) is deduced from other concepts, but in which it may be said
that no autonomous role is played by a concept of empirical
statistical evidence. Some difficulties of interpretation of
likelihood functions are to be avoided, according to Fisher and
Barnard, by placing limitations on the scope of (L) and by
alternative use of the approach of fiducial probability. But to
many, including the present writer, such limitations and alternatives
seem beset with even greater difficulties and obscurities
than those besetting the likelihood approach, and furthermore to
be incompatible with the likelihood concept rather than
complementary to it.)

6.   The  Anomaly  of  the  Empirical  Concept  of  Statistical  Evidence. 

Empirical evidence generally and statistical evidence in
particular are integral parts of the structure, practice, and
process of science. This fact is represented from the standpoint
of the working scientific research investigator by Wilson (1952),
for example; from the standpoint of broad logical and philosophical
analysis of the structure of science, by Nagel (1961), for
example; and from the standpoint of mathematical statistics, by
a major portion of its literature, both basic and expository,
and of general practice of its applications. Throughout much
of this broad body of thought and practice there has run a
seriously-held though often tacit assumption that there exist,
actually or at least potentially, concepts of empirical evidence
and of statistical evidence adequately clear and appropriate to
account for the nature and importance of evidence in the structure
and process of science. Concerning empirical statistical evidence
in particular, undoubtedly one of the strongest forces in the
development of mathematical statistics has been its accepted
specialized but significant responsibility for clarifying basic
concepts (as well as developing working techniques) of empirical
statistical evidence, as such and in relation to other factors
in the process of scientific work.

In connection with the more general concept of empirical
evidence, it is seen in works like those of Wilson and Nagel
cited and in the general practice of scientific work that there
do not now exist very precise concepts of the nature and role of
empirical evidence in science; for example, no very precise account
can be given of the articulation of empirical evidence with the
several other ingredients in the structure of an empirical law
or theory. And much experience and thought leaves unsupported the
view that clear and precise concepts of empirical evidence do
exist somehow implicitly or tacitly in science or that they are
likely to be discovered or developed.

Concerning the more specific problem of developing a precise
and adequate empirical concept of statistical evidence, the
preceding sections give an elementary and self-contained
demonstration that this problem provides an intriguing and
significant mystery: Notwithstanding the evident fact that
statistical evidence is very widely and effectively interpreted
and used in scientific work, no precise adequate concept of
statistical evidence exists; and none can exist, in the sense
that precise versions of several very plausible widely-accepted
minimum conditions for adequacy of such a concept are logically
incompatible! The mathematician's natural response to a disclosed
contradiction, that of selecting consistent subsets of conditions
on which to develop consistent theories, is not available here.
So long as the incompatible criteria appear to be appropriate
mathematical expressions of respective aspects of an
extra-mathematical entity (in this case the more-or-less coherent,
more-or-less shared, more-or-less explicit body of concepts of empirical
statistical evidence which are a part of the structure and process
of science), so long does the incompatibility express an evident
anomaly in that entity. In the writer's opinion this evident
anomaly is a substantial one. It seems worthy of broad serious
intellectual curiosity; and particularly for those working in
theoretical and applied statistics, consideration of it can be a
salutary antidote for over-facile views on the relations among
concepts and between concepts and scientific practice.

This anomaly, which is the evident outcome of the problem
of an adequate precise concept of statistical evidence, is a
vantage-point for an interesting and orderly perspective on the
pattern of historical development of theories of statistical
inference, since in most periods the strongest impetus for new
developments or changes has been the specific inadequacies of
current concepts and techniques for treating statistical evidence.
Such a perspective is presented briefly in Section IJ below, and
also summarized graphically in Figure 2, p. 33a.


PART II. Further Axioms for Statistical Evidence.

The following Sections 7-11 complement Part I by introducing
in elementary mathematical form further axioms for statistical
evidence, with derivations of all implications between axioms, as
summarized in Figure 1, p. 17a. Of these, Sections 7-9 are an
adequate preparation for reading the discussion of Part III.

7. Mathematical Equivalence, (M').

If two experiments differ only in the manner of labeling
of sample points (i.e. only by a one-to-one transformation of
sample space preserving probabilities), they are usually taken
as equivalent for all purposes. For example

\[ E_2 = \begin{pmatrix} .1 & .9 \\ .7 & .3 \end{pmatrix} \]

is thus equivalent to E_1 above. For this reason, corresponding
outcomes of such equivalent experiments are usually considered
as providing equivalent statistical evidence. For example (E_2,1)
is thus equivalent to (E_1,2). This concept may be expressed
informally as "irrelevance of manner of labeling outcomes," and
suggests the simplest and weakest among the several axioms for
statistical evidence to be considered here. It suffices for our
purpose to consider an axiom expressing only a special small part
of the concept of equivalence just indicated:

(M'): Axiom of mathematical equivalence: If x' and x" are
possible outcomes of any discrete experiment E, with
identical probability functions f(x',θ) = f(x",θ), θ ∈ Ω,
then Ev(E,x') = Ev(E,x").

For example, under (M') we have

\[ Ev\left( \begin{pmatrix} .1 & .1 & .8 \\ .4 & .4 & .2 \end{pmatrix},\ 1 \right) = Ev\left( \begin{pmatrix} .1 & .1 & .8 \\ .4 & .4 & .2 \end{pmatrix},\ 2 \right). \]

8. A Weak Concept of "Irrelevant Noise"; the Sufficiency Axiom, (S).

Let E be any specified discrete experiment, with sample space
S and probability function f(x,θ), θ ∈ Ω. Let E be augmented as
follows: If and when a certain outcome x' of E is observed, an
observation y is taken on a random variable Y which has possible
outcomes 1, 2, with respective known probabilities c, 1-c.
When any outcome of E other than x' is observed no auxiliary
observation is taken. The augmented experiment is an experiment
E* with sample space

\[ S^* = \{ x \mid x \in S,\ x \ne x', \text{ or } x = (x',1) \text{ or } (x',2) \}, \]

and probability function

\[ f^*(x,\theta) = \begin{cases} f(x,\theta), & x \in S,\ x \ne x', \\ c\, f(x',\theta), & x = (x',1), \\ (1-c)\, f(x',\theta), & x = (x',2). \end{cases} \]

Since the auxiliary random variable Y has a known distribution
independent of θ, an observed value such as y is considered by
many statisticians as representing only recognizable added
"noise," irrelevant to θ and thus irrelevant to Ev(E*,(x',y));
and thus Ev(E*,(x',y)) is considered equivalent to Ev(E,x').
This concept of "irrelevance of recognizable pure noise" for
statistical evidence may be formulated as:

(S): Sufficiency axiom: Let E and E* be any discrete experiments
with common parameter space Ω. Let one possible outcome x'
of E have probabilities f(x',θ); let two outcomes (x',1),
(x',2) of E* have respective probabilities c f(x',θ),
(1-c) f(x',θ), θ ∈ Ω, with c known. Let the remaining possible
outcomes be in one-to-one correspondence with common labels
x (x ≠ x') and common respective probabilities f(x,θ).
Then Ev(E*,(x',1)) = Ev(E,x').

An example of (S) of the simplest sort is:

\[ Ev\left( \begin{pmatrix} .9 & .1 \\ .3 & .7 \end{pmatrix},\ 1 \right) = Ev\left( \begin{pmatrix} .6 & .3 & .1 \\ .2 & .1 & .7 \end{pmatrix},\ 1 \right). \]

(Under the conditions of (S), the statistic x in E* provides the
simplest sort of example of a sufficient statistic.)
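
The augmentation in (S) can be sketched as a small matrix operation. This is an illustrative sketch under stated assumptions: the function name `augment`, the 1-based outcome label, and the placement of the split columns (x',1), (x',2) in the position of x' are conventions chosen here, and c = 2/3 is assumed because it reproduces the second row (.2, .1, .7) of the augmented experiment in the example.

```python
from fractions import Fraction

def augment(E, x_prime, c):
    """Split outcome x' into (x',1) and (x',2) with probabilities c and 1-c.

    E       : list of rows; E[i][j] = Prob(X = j+1 | theta = i+1)
    x_prime : 1-based label of the outcome to be split (hypothetical convention)
    c       : known probability of the auxiliary outcome Y = 1
    """
    j = x_prime - 1
    augmented = []
    for row in E:
        split = [c * row[j], (1 - c) * row[j]]
        augmented.append(row[:j] + split + row[j + 1:])
    return augmented

E1 = [[Fraction(9, 10), Fraction(1, 10)],
      [Fraction(3, 10), Fraction(7, 10)]]

E_aug = augment(E1, x_prime=1, c=Fraction(2, 3))
for row in E_aug:
    print([float(p) for p in row])
```

The first column of the result is proportional (with constant c) to the first column of the original matrix, which is the relation between (E*,(x',1)) and (E,x') that the axiom exploits.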

Lemma. (S) implies (M').

The proof is immediate upon taking c = 1/2.

9. The Conditionality Axiom, (C).

Let E_1, E_2 be any discrete experiments with common parameter
space Ω, with respective sample spaces S_1 = {x_1}, S_2 = {x_2}, and
respective probability functions f_1(x_1,θ), f_2(x_2,θ). Let E* be
the "mixture" experiment defined as follows: an auxiliary random
variable Y with outcomes 1, 2, with respective known probabilities
c, 1-c (not depending upon θ), is observed. If Y = 1 is observed,
then E_1 is carried out and some outcome x_1 is observed; if Y = 2,
then E_2 is carried out and some x_2 observed. Thus outcomes of E*
have the form (E_1,x_1), x_1 ∈ S_1, or (E_2,x_2), x_2 ∈ S_2; and respective
probabilities cf_1(x_1,θ) and (1-c)f_2(x_2,θ).

The notion that Ev(E*,(E_1,x_1)) is the same as Ev(E_1,x_1) may
be described informally as "irrelevance of experiments which might
have been, but were not, carried out" and as "appropriateness of
conditional interpretations of evidence." This conditionality
concept has been discussed at length (Birnbaum (1962) and references
therein); it may be formulated as:

(C): Conditionality axiom: If E* is a mixture of two discrete
experiments E_1, E_2, with given respective probabilities c, 1-c
(independent of θ), then for any x_1, Ev(E*,(E_1,x_1)) = Ev(E_1,x_1).

An example of (C) of the simplest sort is:

\[
\mathrm{Ev}\!\left( \begin{pmatrix} .9 & .1 \\ .3 & .7 \end{pmatrix},\ 1 \right)
= \mathrm{Ev}\!\left( \tfrac{1}{2} \begin{pmatrix} .9 & .1 & .6 & .4 \\ .3 & .7 & .4 & .6 \end{pmatrix},\ 1 \right) .
\]

(It is readily verified that the 2x4 stochastic matrix in the
right member represents the equal-weighted mixture of experiments

\[
\begin{pmatrix} .9 & .1 \\ .3 & .7 \end{pmatrix}
\quad \text{and} \quad
\begin{pmatrix} .6 & .4 \\ .4 & .6 \end{pmatrix} .)
\]
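
The verification mentioned in the parenthesis can be carried out
mechanically. The sketch below assumes the stochastic-matrix
representation (rows are parameter points, columns are outcomes)
and uses the two 2x2 experiments named in the text; the helper name
`mixture` is a hypothetical choice for the illustration.

```python
from fractions import Fraction as F

def mixture(E1, E2, c):
    """Mixture experiment: with probability c carry out E1, else E2.
    Outcomes are the columns of E1 followed by the columns of E2."""
    return [[c * p for p in r1] + [(1 - c) * p for p in r2]
            for r1, r2 in zip(E1, E2)]

E1 = [[F(9, 10), F(1, 10)], [F(3, 10), F(7, 10)]]
E2 = [[F(6, 10), F(4, 10)], [F(4, 10), F(6, 10)]]

# Equal-weighted mixture, c = 1/2.
E_mix = mixture(E1, E2, F(1, 2))

# Likelihood function of outcome (E1, 1) in the mixture, compared with
# the likelihood function of outcome 1 in E1 itself.
lik_mix = [row[0] for row in E_mix]
lik_E1 = [row[0] for row in E1]
ratios = {a / b for a, b in zip(lik_mix, lik_E1)}
```

The single element of `ratios` is the known mixing probability c, so
the two likelihood functions differ only by a constant factor that
does not depend on θ.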

Lemma. (C) and (M') jointly are equivalent to (L).

Proof. Let (E_1,x_1'), (E_2,x_2') be any instances of statistical
evidence with common parameter space, with probability functions
satisfying f_1(x_1',θ) = cf_2(x_2',θ), θ ∈ Ω, for some positive c;
without loss of generality we assume c ≤ 1. Consider the mixture
experiment E* of E_1, E_2, with respective probabilities 1/(1+c),
c/(1+c). We have by (C) that Ev(E*,(E_1,x_1')) = Ev(E_1,x_1') and
Ev(E*,(E_2,x_2')) = Ev(E_2,x_2'). In E*, outcome (E_2,x_2') has
probabilities (c/(1+c)) f_2(x_2',θ), and outcome (E_1,x_1') has
probabilities (1/(1+c)) f_1(x_1',θ); these, by the original
assumption, are equal for each θ ∈ Ω. Hence by (M'),
Ev(E*,(E_1,x_1')) = Ev(E*,(E_2,x_2')). Thus Ev(E_1,x_1') = Ev(E_2,x_2'),
completing the proof that (C) and (M') imply (L).
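
The role of the mixing probabilities 1/(1+c), c/(1+c) in this
argument can be checked on a small case. The sketch below is an
illustration only: the three-point parameter space, the likelihood
values, and the helper name `mixing_weights` are hypothetical, chosen
so that f_1 = cf_2 with c = 1/2.

```python
from fractions import Fraction as F

def mixing_weights(c):
    """Weights (1/(1+c), c/(1+c)) assigned to E_1 and E_2 in the proof."""
    return 1 / (1 + c), c / (1 + c)

# Hypothetical likelihood functions over three parameter points,
# constructed so that f1 = c * f2.
c = F(1, 2)
f2 = [F(2, 10), F(4, 10), F(6, 10)]
f1 = [c * v for v in f2]

w1, w2 = mixing_weights(c)
p_E1 = [w1 * v for v in f1]   # probabilities of (E_1, x_1') in the mixture
p_E2 = [w2 * v for v in f2]   # probabilities of (E_2, x_2') in the mixture
```

With these weights the two outcomes of the mixture have identical
probabilities at every parameter point, which is exactly the
condition under which (M') applies.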

It is obvious that (L) implies (C) and (M').


10. A Concept of Irrelevant Censoring, (Ce).

Let x' be a specific possible outcome of any discrete
experiment E, with probability function f(x,θ). Let E' be an
experiment, which may be described as a "censored" version of E,
in which an outcome also labeled x' has the same respective
probabilities f(x',θ) as in E, and there is just one other
possible outcome x'', for which the respective probabilities
are necessarily 1-f(x',θ). (E' is related to E by the particular
many-to-one ("censoring") transformation of sample space which
preserves x' and carries all other sample points of E into x''.
E' is "less informative" than E, in the sense defined in the theory
of comparison of experiments.)
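
The censoring transformation is simple enough to state as a short
computation. The following sketch assumes the stochastic-matrix
representation (rows are parameter points, columns are outcomes); the
function name `censor` and the three-outcome experiment E are
hypothetical.

```python
from fractions import Fraction as F

def censor(matrix, keep_col):
    """Censored experiment E': outcome `keep_col` (x') is preserved and
    all other outcomes are merged into the single outcome "not x'"."""
    return [[row[keep_col], 1 - row[keep_col]] for row in matrix]

# Hypothetical experiment with two parameter points and three outcomes.
E = [[F(5, 10), F(3, 10), F(2, 10)],
     [F(2, 10), F(2, 10), F(6, 10)]]

E_cens = censor(E, keep_col=0)

# The probabilities of x' are the same in E and in its censored version.
same_likelihood = [r[0] for r in E_cens] == [r[0] for r in E]
```

Only the column for x' survives intact; everything that distinguishes
the other outcomes of E from one another is discarded.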

We now compare Ev(E,x') with Ev(E',x'): E' is equivalent
to an experiment in which E is carried out, and the outcome is
described only as either "x'" or "not x'." In the case of the
outcome (E',x'), then, one is in exactly the position of an
experimenter who has carried out E and observed x', except in
the hypothetical respect that, if x' had not occurred in E', one
would then have obtained only the incomplete information "not x'."
Many statisticians find it in accord with their concepts of
evidence to consider this hypothetical distinction irrelevant,
and accordingly consider Ev(E',x') and Ev(E,x') equivalent in
such cases. The point can be further illustrated in terms of a
non-statistical example (a non-probabilistic analog of the example
employed by Pratt (1961, 1962) in originating this concept and
showing its consequences): If an accurate voltmeter gave a reading
of 87, does it matter, for the interpretation and usefulness of
this reading (assumed error-free), whether the meter's range was
bounded by 1,000, or bounded by 100? In the latter case, readings
between 100 and 1,000, which might have occurred but in fact did
not, would have been indistinguishable.

The concept that such hypothetical distinctions are irrelevant
to the evidence in question may be expressed informally as
"irrelevance of censoring which might have, but in fact did not,
affect an observed outcome." Since the structure of E' is
determined by just the function f(x',θ), θ ∈ Ω, this concept may
be formulated as:

(Ce): Axiom of irrelevant censoring: For any specified outcome
x' of any discrete experiment E, Ev(E,x') is characterized
fully by just the function f(x',θ), θ ∈ Ω, without other
reference to E or x'.

An example of (Ce) of the simplest sort is:

Lemma. (Ce) implies (M').

Proof. In E, let x', x'' be such that f(x',θ) = f(x'',θ), θ ∈ Ω,
as in the condition of (M'). Then, assuming (Ce), we have
Ev(E,x') = Ev(E,x''), the conclusion of (M').

Lemma. (Ce) and (S) jointly are equivalent to (L).

Proof. Assuming (Ce) and (S), consider any (E,x') and (E*,y')
for which f(x',θ) = cf*(y',θ), θ ∈ Ω, c > 0. If c = 1, (Ce) gives
that Ev(E,x') = Ev(E*,y'). If c ≠ 1, we may assume without loss
of generality that 0 < c < 1. (If c > 1 we could write
f*(y',θ) = (1/c) f(x',θ) where 0 < (1/c) < 1.) Consider the
auxiliary random variable Z with outcomes 1, 2, with respective
probabilities c, 1-c; Z is to be observed only if and when
outcome y' of E* is observed. In the augmented experiment E**,
the outcome (y',1) has probabilities cf*(y',θ), equal to those
of x' of E; hence by (Ce) we have Ev(E**,(y',1)) = Ev(E,x').
And by (S), Ev(E**,(y',1)) = Ev(E*,y'). Hence Ev(E,x') = Ev(E*,y'),
completing the deduction of (L). The converse is obvious.

11. A Concept of Irrelevance of Some Hypothetical Continuations
of Sequential Experiments, (H).

Consider any two sequential (discrete) experiments E_1, E_2,
with common parameter space Ω, in which the first stages have the
same form. More specifically, suppose that in each experiment the
first stage consists of observing a random outcome X_1 taking just
the values x' or x'' with respective probabilities g(x',θ),
g(x'',θ) = 1 - g(x',θ), θ ∈ Ω, with each experiment terminating
in the first stage if and only if X_1 = x' is observed. If X_1 = x'',
both experiments continue but with different rules for further
observation and termination.

Let us consider and compare Ev(E_1,x') and Ev(E_2,x'). Suppose
that two scientists independently investigating the same subject-
matter, represented by the common parameter space Ω, adopt two
different experimental procedures represented by E_1 and E_2.
Suppose that, unknown to each other, they entrust the execution
of their respective experiments to the same laboratory technician,
who notices the common form of the first stages of E_1 and E_2.
Suppose also that the technician appreciates that he can economize
without invalidating either experiment by taking a single
observation on X_1, rather than one for E_1 and another for E_2,
and does so, and happens to observe X_1 = x'. He properly reports
to the first scientist the statistical evidence (E_1,x') and to the
second scientist the formally different statistical evidence
(E_2,x'), although of course the two reports are descriptions of a
single physical experimental situation and outcome.

In such a case many statisticians and scientists would
consider Ev(E_1,x') and Ev(E_2,x') equivalent, and would consider
as irrelevant to the evidence in question the physically-hypothetical
distinction that if X_1 = x' had not occurred then the further
physical realizations of the respective experiments would in
general have taken different forms. The latter concept may be
expressed informally as "irrelevance of differences between
sequential sampling rules which might have, but in fact did not,
affect an observed outcome."

A further illustration of this concept is provided by
considering the situation of the technician in the example on the
assumption that he also has an independent scientific interest in
the subject matter. He must carry out a definite sequential
experiment which we denote by E_3, which begins with an observation
of X_1 as its first stage and which terminates there only if
X_1 = x'. In the remaining case his experiment continues according
to some definite plan as required to complete both E_1 and E_2.
(In general E_3 will be more informative than either E_1 or E_2
since in general E_3 requires further observations after the
termination of E_1 or E_2.) Thus in the case of outcome X_1 = x' the
technician obtains the statistical evidence (E_3,x') which is
mathematically distinct from (E_1,x') and from (E_2,x').
Considerations like those above suggest the equivalence of
Ev(E_1,x'), Ev(E_2,x'), and Ev(E_3,x').

This concept may be formulated as:

(H): Axiom of irrelevance of hypothetical sequential continuations.
Let the distinct (discrete) sequential experiments E_1, E_2,
with common parameter space Ω, have identical first stages
consisting of an observation on a random outcome X_1 taking
just the values x', x'', with respective probabilities
g(x',θ), 1-g(x',θ), θ ∈ Ω. Let X_1 = x' be a termination
point for each experiment. Then Ev(E_1,x') = Ev(E_2,x').

An example of the simplest sort is the following: Let θ = .01
or .99 only; let g(1,θ) = θ, g(2,θ) = 1-θ. In E_1 let X_1 = 2 be
followed by a second and final observation on a random variable
X_2 independent of, and having the same distribution as, X_1. In
E_2, let X_1 = 2 also be a termination point. (Thus E_2 is not
sequential; but it can be made sequential in a formally genuine
but trivial sense by adding, when X_1 = 2, a further observation
on a random variable Y with known distribution (not depending
upon θ and thus providing no information when observed).)
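
The example just given can be written out explicitly. The sketch
below enumerates the outcomes of E_1 and E_2 for θ = .01 and .99;
the representation of an outcome as a tuple of observed values, and
the function names, are assumptions made for the illustration.

```python
from fractions import Fraction as F

THETAS = [F(1, 100), F(99, 100)]

def g(x, theta):
    """First-stage probability function: g(1,theta) = theta, g(2,theta) = 1-theta."""
    return theta if x == 1 else 1 - theta

def outcomes_E1(theta):
    """E_1: stop if X_1 = 1; otherwise take one further observation X_2
    with the same distribution as X_1, independently."""
    probs = {(1,): g(1, theta)}
    for x2 in (1, 2):
        probs[(2, x2)] = g(2, theta) * g(x2, theta)
    return probs

def outcomes_E2(theta):
    """E_2: a single observation X_1; both values are termination points."""
    return {(1,): g(1, theta), (2,): g(2, theta)}

# Likelihood functions of the first-stage termination outcome X_1 = 1.
lik_E1 = [outcomes_E1(t)[(1,)] for t in THETAS]
lik_E2 = [outcomes_E2(t)[(1,)] for t in THETAS]
```

The two experiments have different sample spaces, but the outcome
X_1 = 1 has the same probability θ in each, which is the situation
addressed by (H).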

Lemma. (H) is equivalent to (Ce).

Proof. Clearly (Ce) implies (H), since in the latter's assumptions
(E_1,x') and (E_2,x') have identical probability functions g(x',θ).
To prove the converse, let (E_1,x'), (E_2,x') be any instances of
statistical evidence with common parameter space Ω and probability
functions such that f_1(x',θ) = f_2(x',θ), θ ∈ Ω. Let E_1' be the
"censored" version of E_1, with outcomes y = y(x) = 1 if x = x'
and y = 0 otherwise; E_1' has the p.d.f.

\[
g(y,\theta) =
\begin{cases}
f_1(x',\theta) & \text{if } y = 1 , \\
1 - f_1(x',\theta) & \text{otherwise,}
\end{cases}
\qquad \theta \in \Omega .
\]

Let E_1'' be the "truncated" version of E_1 from which possible
outcome x' is deleted and other outcomes have corresponding
probabilities; E_1'' has the p.d.f.

\[
h_1(x,\theta) = \frac{f_1(x,\theta)}{1 - f_1(x',\theta)} ,
\qquad x \in S_1'' ,\ \theta \in \Omega ,
\]

where S_1'' denotes S_1 with x' deleted. Let E_1* be a two-stage
sequential experiment defined as follows: (1) E_1' is carried out,
and if y = 1 is observed the experiment E_1* is terminated, but
otherwise: (2) E_1'' is carried out. The possible outcomes z of
E_1* are thus denotable as z = x' (representing termination in
stage (1) with observation of y = 1), and z = x for each possible
value x other than x' (representing each possible termination in
stage (2)); thus the sample space of E_1* is identical with that of
E_1. It is readily verified that the probability function f_1*(z,θ)
of E_1* is identical with the probability function f_1(x,θ). Let
E_2', E_2'', and E_2* be defined analogously in terms of E_2. Then the
sequential experiments E_1*, E_2* satisfy the assumptions of (H),
and assuming (H) we have that Ev(E_1*,x') = Ev(E_2*,x'). By the
mathematical equivalence of (E_1*,x') with (E_1,x'), and (E_2*,x') with
(E_2,x'), we have also Ev(E_1,x') = Ev(E_2,x'), the conclusion of (Ce),
completing the proof.
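
The step "it is readily verified" in this proof may be illustrated
by direct computation. The sketch below rebuilds the probability
function of the two-stage experiment from its censored and truncated
parts, for a hypothetical three-outcome experiment; the function name
and the matrix are assumptions, and the construction requires
f_1(x',θ) < 1 so that the truncated experiment is defined.

```python
from fractions import Fraction as F

def sequential_reconstruction(f_row, x_idx):
    """Probability function of the two-stage experiment at one
    parameter point.

    Stage 1 (censored): y = 1 with probability f(x'), else y = 0.
    Stage 2 (truncated): if y = 0, outcome x != x' occurs with
    probability f(x) / (1 - f(x')). Requires f(x') < 1.
    """
    p_stop = f_row[x_idx]
    result = []
    for i, p in enumerate(f_row):
        if i == x_idx:
            result.append(p_stop)                             # stop in stage 1
        else:
            result.append((1 - p_stop) * (p / (1 - p_stop)))  # continue to stage 2
    return result

# Hypothetical experiment E_1: rows are parameter points, columns outcomes.
E1 = [[F(5, 10), F(3, 10), F(2, 10)],
      [F(2, 10), F(2, 10), F(6, 10)]]

E1_star = [sequential_reconstruction(row, x_idx=0) for row in E1]
```

In exact arithmetic the reconstructed matrix coincides with E_1, so
(E_1*,x') and (E_1,x') are mathematically the same instance of
evidence, as the proof asserts.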

Since (Ce) and (S) were shown above to be jointly equivalent
to (L), we see that (H) and (S) are jointly equivalent to (L).


PART III. Discussion.

12. Evidence, Error-Probabilities, and Inductive Behavior
(Decision-Making).

The  leading  theory  of  statistical  inference,  that  of  Neyman 
and  Pearson,  draws  its  strength  from  the  fact  that  it  bases  itself 
rather  exclusively  on  the  highly  successful  modern  theory  of 
probability  and  its  applications.   In  contrast  with  other  theories 
of  statistical  inference,  it  avoids  introducing  basic  interpreted 
terms  other  than  error-probabilities,  and  gives  to  the  latter  the 
frequency  interpretation  widely  accepted  for  probabilities 
generally. 

The very plausible widely-held concept that a systematic
connection of some kind must exist between suitably specified
evidential interpretations and error-probabilities has been
expressed in the present paper in the criterion (U) stated in
Section 4 above. On the other hand, each of the very plausible
concepts expressed in the various axioms formulated above implies
a criticism and rejection of the dominant view that error-
probabilities can serve as the sole basic term in an adequate
mode of evidential interpretations. (These incompatibilities
have been discussed in detail by the writer (1961, 1962) and
others cited therein.)

The incompatibility in principle between evidential
interpretations typically made in current practice and the most
widely accepted theoretical basis for that practice is also
illustrated very sharply in another way, by reference to recent
advances hailed by Neyman as "breakthroughs in the theory of
statistical decision making." Neyman (1962, p. 22) writes "The
essence of Robbins' second breakthrough (of 1950) is the
information that, if a statistician deals simultaneously with a
large number N of identical problems of testing hypotheses and
wants to diminish the overall expected frequency of errors of both
kinds, then, at least in certain circumstances, he can bring this
frequency below the level attainable through N independent
applications of the most powerful test. The resulting gain may be
impressive." This quotation can be paraphrased to supply an
analogous interpretation of Stein's (1955) result: "If a
statistician deals simultaneously with N (N ≥ 3) identical problems
of point-estimation, with independent known normal
error-distributions, and wants to diminish the overall expected
mean squared error, he can do so by replacing the classical
estimates (the respective independent sample means) by estimates
of respective means each of which in general depends on other
samples besides that whose distribution is determined by the
estimated mean."
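
The kind of gain described in this paraphrase can be illustrated by
simulation. The sketch below is not taken from the present paper: it
uses the James-Stein shrinkage rule, a standard explicit estimator of
the type covered by Stein's result, and assumes that each of the N
sample means is a single normal observation with unit variance. The
parameter values, number of replications, and seed are hypothetical.

```python
import random

def james_stein(x):
    """James-Stein shrinkage toward 0 for unit-variance normal observations."""
    n = len(x)
    norm_sq = sum(v * v for v in x)
    shrink = 1 - (n - 2) / norm_sq
    return [shrink * v for v in x]

def squared_error(estimate, truth):
    """Total squared error summed over the N problems."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth))

def simulate(theta, reps=2000, seed=0):
    """Average total squared error of the raw observations and of the
    James-Stein estimates over `reps` simulated sets of observations."""
    rng = random.Random(seed)
    raw_total = js_total = 0.0
    for _ in range(reps):
        x = [rng.gauss(t, 1.0) for t in theta]
        raw_total += squared_error(x, theta)
        js_total += squared_error(james_stein(x), theta)
    return raw_total / reps, js_total / reps

# Hypothetical true means for N = 10 unrelated estimation problems.
theta = [-1.0, -0.8, -0.5, -0.2, 0.0, 0.1, 0.4, 0.6, 0.9, 1.0]
raw_risk, js_risk = simulate(theta)
```

Each compounded estimate depends on all N observations through the
common shrinkage factor, which is the feature of the method at issue
in the discussion that follows.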

(It is important to appreciate that although the parameter
spaces of the respective experiments are assumed to have the same
form, the subject-matter under investigation is assumed to be
distinct in the respective experiments. If the subject-matter were
the same, the overall experiment would simply take the form of N
independent replications of one experiment, and there would be no
question of compounding to be considered. Of course the assumption
of common forms is made to give a simple case and is not an
essential restriction in the compounding approach.)

A formally valid application of Stein's compounding
method would be to "improve" published lists of various physical
constants (when measurement errors could be assumed known, normal,
and independent and assigned a common scale-unit) by recomputing
them in the indicated pooled or compounded way. No barrier nor
qualification concerning such applications is to be found either
in the formal assumptions or in Neyman's recommendations for very
broad application of these methods. However, an application like
that described would be grossly incompatible with unformalized
but firmly held concepts of many scientists and statisticians
concerning the autonomy of units of experimental evidence arising
in natural or experimental situations regarded as unrelated.

(An exposition of the compounding approach suitable for
early inclusion in any course introducing the mathematics of
decision theory is that of Robbins and Samuel (1961).)

It is interesting to note that the likelihood axiom (apart
from questions of interpreting likelihood functions) entails an
automatic provision for a concept of autonomy between units of
independent statistical evidence concerning distinct subject-matters.
The notion of independent units of statistical evidence requires
formalization in the product rule of probability for independent
events, corresponding here to a product of density functions for
respective independent samples; the notion of distinct subject-
matters requires over-all formalization in a parameter space which
is the cartesian product of respective-subject-matter parameter
spaces; and the result is a single overall model of an experiment
with the features mentioned. The product relation entails that
each likelihood function of an outcome of this experiment will have
the recognizable form of a product of "marginal" likelihood functions,
each representing evidence concerning one distinct subject-matter.
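
The product form described here can be made concrete in a small
case. The sketch below assumes two unrelated subject-matters, each
with a two-point parameter space; the parameter labels, likelihood
values, and the helper name `ratio_in_first` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical "marginal" likelihood functions of the observed outcomes.
lik_1 = {"a1": F(9, 10), "a2": F(3, 10)}   # evidence on subject-matter 1
lik_2 = {"b1": F(4, 10), "b2": F(6, 10)}   # evidence on subject-matter 2

# Overall model: the parameter space is the cartesian product, and by
# independence the joint likelihood is the product of the marginals.
joint = {(t1, t2): lik_1[t1] * lik_2[t2]
         for t1, t2 in product(lik_1, lik_2)}

def ratio_in_first(t1a, t1b, t2):
    """Likelihood ratio between two values of the first parameter,
    with the second parameter held fixed at t2."""
    return joint[(t1a, t2)] / joint[(t1b, t2)]
```

The likelihood ratio concerning the first subject-matter is the same
whichever value of the second parameter is held fixed, so the
evidence about one subject-matter is unaffected by the outcome
observed for the other.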

13. General Discussion of Concepts of Statistical Evidence.

Some of the axioms for statistical evidence formulated above,
and related concepts, have sometimes been referred to previously
by the present writer and others as "principles" rather than
axioms. The term "axiom" has been used exclusively here to avoid
possible misunderstandings of intention or possibly-misleading
connotations such as may have prompted the comments of Neyman
(1962, pp. 24-5). Perhaps it is worth remarking that here as
elsewhere axioms have been formulated to characterize a
mathematical subject-matter, and their relations and implications
developed in the mathematical way, independent of possible
interpretations and applications. And here as elsewhere, the
adequacy of each proposed interpretation or application of an
axiom or a mathematical consequence, in relation to someone's
concepts or (extra-mathematical) theories or practical purposes,
is a quite distinct question. In the latter connection, the
subject-matter of concepts of statistical evidence has of course
its own difficulties and obscurities, some of them of very long
standing; if this subject-matter may be compared broadly with other
subject-matters of applied mathematics, it is perhaps marked by
the manner and degree in which it is conceptual rather than
physical.


[Figure 2 (p. 35a): schematic diagram of the historical pattern of
development of theories of statistical inference.]

Of course the examples and interpretations accompanying the
concepts formulated above are to be taken as illustrative, and as
representing the mode in which some mathematical statisticians
have expressed and attempted to refine their concepts of statistical
evidence by formulation of axioms. In this area, each interested
person is peculiarly able to practice applied mathematics
independently, and to be in principle his own ultimate authority
concerning the adequacy or tenability, interpretability and
usefulness of each axiom, each consequence, and of the entire
mathematical approach to this subject-matter. Of course such
independent critical consideration of basic concepts is a necessary
condition for a much-to-be-desired genuine consensus on such
concepts.

These concepts and axioms about evidence can usefully be
regarded in each of three ways: (a) in critical confrontation
with one's own concepts of evidence; (b) as concepts of some
prominent theoretical statisticians, which have played a role of
some significance in the development of statistical thought; and
(c) as a body of mathematical material whose structure is not
without interest even when considered apart from any
interpretations and extra-mathematical implications.

As is familiar to teaching and consulting statisticians,
students and users of the Neyman-Pearson approach and of Neyman's
concept of "inductive behavior" (or decision-making as contrasted
with inference) frequently meet difficulties through trying to
relate their concepts of evidence to the Neyman-Pearson theory;
and frequently find such attempts difficult to give up because
for many individuals concepts of evidence are among the strongest
and most interesting intuitive concepts engaged by study or use
of statistics. Even for those who find little value or interest
in concepts of statistical evidence, a positive elementary approach
to these concepts may have at least the value of helping to clarify
and crystallize this direction of thought even if only to facilitate
its distinction from another preferred approach.

In this spirit it seems useful and interesting to consider
broadly the pattern of historical development of theories of
statistical inference, particularly with reference to the impetus
provided at each stage by concern for a more adequate concept of
statistical evidence. Such a pattern is indicated schematically in
Fig. 2, p. 35a. Even after qualifications about possible additions
and refinements are taken into account (these will not be discussed
here), the indicated pattern seems to have substantial accuracy
and interest.

The  axioms  of  evidence  discussed  in  the  present  paper  are 
to  be  located  in  the  category  of  non-Bayesian  inference  in 
Figure  2.   They  may  be  said  to  stem  from  some  (but  not  all)  of 
R.  A.  Fisher's  theories  of  inference;  and  from  the  concepts  of 
evidence  of  some  theoretical  statisticians  oriented  more  closely 
to  the  Neyman-Pearson  theory.   Of  course  the  likelihood  concept 
is  also  included  in  Bayesian  inference  theory. 

The ordering

(B) => (L) <=> (C) => (S)

of the four concepts indicated here (sufficiency, conditionality,
and likelihood as concepts of evidence; and Bayes' principle,
denoted by (B)), is by their logical strengths; except that (C)
and (L) are ordered by their evident intuitive and logical
strengths as these seemed several years ago, before their
equivalence was seen. (We tacitly assume at this point, as
in previous papers, that the intuitively-trivial axiom (M') is
adjoined to (C).) Throughout the writings of Fisher and Barnard,
basic importance is ascribed to both (L) and (C), but with
apparently different scopes and significance. Among those closer
to the Neyman-Pearson theory, a time pattern of interest in, and
tendency to accept, successively stronger concepts of evidence can
be discerned.

One of the earliest applications of the Neyman-Pearson theory
was the construction of binomial confidence intervals (the Clopper-
Pearson charts, 1934). It seems impossible to account for the
non-use of auxiliary randomization variables to obtain improved or
exact results in the terms of the Neyman-Pearson theory, both here
and in apparently all subsequent applications of that theory,
except as an attempt to incorporate a sufficiency concept of
evidence within the Neyman-Pearson theory. The comments of
E. S. Pearson himself, discussed by Tukey (1962, pp. 12-13), seem
to support this view strongly.

The appropriateness of a conditionality concept of evidence
was stressed by Cox (1958) and Tukey (1958). And the equal
appropriateness of a likelihood concept of evidence was seen to
be entailed by the present writer's demonstration (1961, 1962)
that (C) implies (L).

The pattern of these developments may be described as one
of re-emergence of successive parts of the Bayesian concept of
evidence on non-Bayesian ground. The neo-Bayesian writers (notably
Savage (1954, 1961, 1962, 195^) and Good (1950)) have contributed
significant new knowledge of possibilities of specification and
interpretation of decision problems and related utilities and
personal or subjective probabilities; and in this context a
characterization of evidence coinciding formally with (L) appears.
(An elementary expository version of the central neo-Bayesian
result, the deduction of Bayes' principle from axioms characterizing
rational decision-making behavior, has been given by Pratt
et al (1954). A detailed analysis of the structure of Savage's
axioms and derivation has been given by Morlat (1959).) But
those questions which led to the decline and disuse of the
classical Bayesian approach, the questions of specification and
interpretation of prior probabilities in terms assimilable in the
structure of empirical scientific work, remain unanswered.
(This problem of giving adequate interpretation to prior
probabilities is replaced by a "smaller" problem if prior
information and opinion are assumed representable by a "prior
likelihood" as described by Hudson (1964). For those inclined to
accept the likelihood concept for experimental evidence, such an
extension of the likelihood concept would seem particularly
attractive; but its value depends of course on an answer to
questions of interpreting likelihood functions in general.)

The central result of these considerations is a trilemma
concerning the concept of statistical evidence: The only theories
which are formally complete, and of adequate scope for treating
statistical evidence and its interpretations in scientific research
contexts, are Bayesian; but their crucial concept of prior probability
remains without adequate interpretation in these contexts.
Each of the non-Bayesian alternatives, one identified with the
likelihood concept and the other with the error-probability concept,
seems an essential part of any adequate concept of evidence, but
each separately is seriously incomplete and inadequate; however,
these cannot be combined because they are incompatible (except,
curiously, in the simplest restricted case, where one may say
that a thoroughly satisfactory concept of empirical statistical
evidence and its interpretation exists in miniature).

Appendix: Further Details on the Axioms.

i. Pratt's strikingly simple, plausible, and original censoring
concept  seems  to  be  the  best  elementary  intuitive  point  of 
departure  for  appreciation  of  the  major  part  of  the  likelihood 
axiom. An interesting alternative to Sections 2 and 3 above is
one based on material from Sections 5, 8, and 10, presenting just
(Ce),  then  (S),  and  then  (L)  and  its  equivalence  to  (Ce)  and 
(S)  jointly. 


ii. The presentation of (F) in Section 11 stems from the writer's
search for an intuitively direct appreciation of Barnard's suggestions
that the likelihood axiom, particularly when viewed as a
concept of "irrelevance of stopping rules," is obvious. The
method of proof of equivalence of (F) and (Ce) shows that there
is no substantial mathematical distinction between sequential
experiments  and  others:   Each  possible  termination  point  of  any 
experiment,  sequential  or  not,  can  under  relabeling  be  interpreted 
as  a  possible  first-stage  termination  point  of  a  sequential 
experiment,  and  so  on.   This  is  not  at  all  incompatible  with, 
but  rather  confirms,  the  heuristic  value  frequently  found  in 
sequential examples in discussions of concepts of evidence.
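
The following sketch is not part of the original text. It illustrates the remark on stopping rules with a standard Bernoulli-trials example; the function names and the particular values n = 12 and x = 3 are assumptions chosen only for illustration. It compares the probability of x successes in a fixed number n of trials with the probability that the x-th success occurs on trial n under inverse sampling. The two likelihood functions differ only by a factor not depending on the parameter, so that under (L) the two outcomes would be taken to represent the same evidence.

```python
from math import comb


def fixed_n_likelihood(theta, n, x):
    """Probability of x successes when the number of trials n is fixed in advance."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def inverse_sampling_likelihood(theta, n, x):
    """Probability that the x-th success occurs on trial n (stop at the x-th success)."""
    return comb(n - 1, x - 1) * theta**x * (1 - theta) ** (n - x)


def likelihood_ratio(theta, n=12, x=3):
    """Ratio of the two likelihoods at a given parameter value (illustrative values)."""
    return fixed_n_likelihood(theta, n, x) / inverse_sampling_likelihood(theta, n, x)


# The ratio equals comb(12, 3) / comb(11, 2) whatever the value of theta.
ratios = [likelihood_ratio(theta) for theta in (0.1, 0.25, 0.5, 0.9)]
```

The ratio depends only on the two binomial coefficients, which is the sense in which the stopping rule drops out of the likelihood function in this example.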

iii. The notation (C) might be reserved for the formulation of
previous papers, which expresses the conditionality concept in
its natural full scope, namely all mixture experiments. The
notation (C') might be used for the weaker, simpler formulation
which suffices in Section 9 above, where only mixtures of two
components  are  considered.   (Repeated  applications  of  (C)  extend 
its  scope  to  mixtures  of  any  finite  number  of  components,  but  not 
to  a  countable  infinity  of  components.)   The  main  point  of  interest 
of  course  is  just  that  even  (C')  (the  formulation  of  this  paper, 
called  (C)  here)  suffices  with  (M')  to  imply  (L);  the  slightly 
weaker  assumptions  make  the  result  slightly  stronger. 

iv.   The  natural  scope  of  the  mathematical  equivalence  concept 
discussed  in  Section  7,  all  one-to-one  transformations  of  S,  might 
be denoted by (M) in distinction from the simpler, weaker (M')
which sufficed there. Previously (1962, pp. 277-") this concept
was described but not formulated axiomatically, and was tacitly
assumed  in  assertions  (unaccompanied  by  proofs)  that  (C)  implies  (S). 

v. The natural scope of the sufficiency concept is all sufficient
statistics,  as  in  formulations  denoted  (S)  in  previous  papers. 
The simpler, weaker formulation which suffices in Section 8 above
might be given the different denotation (S'). (Repeated application
of (S') gives an equivalent formulation covering all sufficient
statistics in experiments with finite (but not countably infinite)
sample  spaces.)   It  seems  of  pedagogical  interest  that  the  concepts 
of  statistic  and  sufficient  statistic  were  not  introduced  explicitly 
in  Parts  I  and  II  above,  although  the  material  there  includes  what 
many  would  consider  the  principal  significance  of  the  sufficiency 
concept. Similarly, presentation of (L) and its implications did
not  require  explicit  consideration  of  the  likelihood  function  as  a 
sufficient statistic.